Abstract- Over the last 10 years, various methods have been used 
for ear recognition. This paper describes the automatic 
localization of an ear and its segmentation from the side pose 
of a face image. The authors propose a novel 
approach to feature extraction from the ear image using the 2D Dual 
Tree Complex Wavelet Transform (2D-DT-CWT), which 
provides six sub-bands in six different orientations, against three 
orientations in the DWT. Because the DT-CWT is complex, it exhibits the 
property of shift invariance. Ear feature vectors are obtained 
by computing the mean, standard deviation, energy and entropy 
of the six sub-bands of the DT-CWT and the three sub-bands of the DWT. 
Canberra distance and Euclidean distance are used for 
matching. A recognition accuracy above 97% is achieved. 

Keywords- Ear recognition; ear detection; ear biometrics; DT-CWT; 
complex wavelet transform 

I. Introduction 

Ear recognition has received considerably less attention 
than many alternative biometrics, including face, fingerprint 
and iris recognition. Ear-based recognition is of particular 
interest because it is non-invasive and because it is not 
affected by environmental factors such as mood, health, and 
clothing. Also, the appearance of the auricle (outer ear) is 
relatively unaffected by aging, making it better suited for 
long-term identification [1]. Ear images can easily be taken 
from a distance without the knowledge of the person concerned. 
Therefore, ear biometrics is suitable for surveillance, security, 
access control and monitoring applications. 

As compared to face biometrics [2]-[4], ears have several 
advantages over complete faces, such as reduced spatial 
resolution, a more uniform distribution of color, and less 
variability with expressions and orientation of the face. The 
deep three-dimensional structure of the ear makes it very difficult to 
counterfeit. In face recognition there can be problems of 
illumination variation, pose variation and facial 
expressions [4]. The ear was first used for recognition of human 
beings by Iannarelli, who used manual techniques to identify 
ear images. The medical literature reports that 
ear growth is proportional after the first four months of birth and that 
changes are not noticeable from age 8 to 70 [1]. The remainder 
of this paper covers existing ear recognition techniques, 
localisation and normalisation of the ear, feature extraction using 
DT-CWT, matching, experimental results and conclusions 
in Sections 2, 3, 4, 5, 6 and 7 respectively. 
II. Existing Ear Recognition Techniques 

Most work on automatic ear localisation [2]-[11] has been 
done in the past 10 years. Automatic ear recognition 
using Voronoi diagrams to handle the adverse effects of 
lighting, shadowing and occlusion has been presented by 
Burge and Burger [6]. In [9], an Active Contour Model (or 
Snakes) is used to segment the ear from side images of 
the face. Hurley, Nixon and Carter [7] have used force field 
transformations for ear localisation. [8], [12] and [13] make 
use of 3-D range images to extract the ear from the image of a 
human. However, the tougher challenge is to detect the ear 
from an intensity image. A shape model-based technique for 
locating human ears in side face range images is proposed in 
[8]. In this method, the ear shape model is represented by a 
set of discrete 3D vertices corresponding to the ear helix and 
anti-helix parts. Ansari and Gupta have proposed an approach 
based on the edges of the outer ear helices, exploiting the 
parallelism between the outer helix curves of the ear to localize 
the ear [10]. Skin-color and contour information has been 
exploited for ear detection by Yuan and Mu [11]. In [13], the 
authors have presented a distance transform and template 
based technique for automatic ear localization from a side 
face image. The technique first segments skin and non-skin 
regions in the face and then uses a template-based approach 
to find the ear location within the skin regions. 

Victor et al. [4] and Chang et al. [2] have investigated the use of 
PCA and FETET for ear recognition. Moreno et al. [5] used 
2D intensity images of ears with three neural net approaches 
for ear recognition. In [16], Anupama Sana et al. presented 
an ear biometric system based on the discrete Haar Wavelet 
Transform, whereas Wang and Yuan [17] used Gabor wavelets 
and general discriminant analysis. Wang Xiaoyun et al. [19] 
proposed a block segmentation based approach, whereas a 
modular neural network architecture has been proposed by 
Gutierrez et al. [18]. 

III. Proposed System 
The block diagram of the proposed system is shown in Fig. 1. 

Figure 1. Block diagram of proposed system 

It consists of an image acquisition module, a preprocessing and 
automatic ear localization module, a DT-CWT based feature 
extraction module and a matching module. 

A. Image Acquisition Module 

A Sony DSC-HX1 camera (15 megapixel, 20x optical zoom) 
is used for image acquisition. A database of 240 images of 40 
subjects, covering left and right ears, was created at MCTE; the UND 
database is also used. 

B. Preprocessing and ear localisation module 

The raw image is not suitable for feature extraction because of its 
large background, so some pre-processing is required to 
make it suitable. 

C. Feature extraction module 

After successful ear localization, features are extracted using 
DT-CWT. The details are given in Section V. 

D. Matching module 

The energy, entropy, mean and standard deviation of each sub-band 
of the DT-CWT are calculated to create a feature vector. 
Euclidean distance and Canberra distance are used as 
similarity measures for matching the feature vector of a test 
image against those of the images stored in the database (1:N match). 
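
As a minimal sketch of this matching step, assuming each feature vector is a plain one-dimensional array of real numbers, the two distances and a 1:N search could be written in Python as follows. The function names and the `gallery` dictionary are hypothetical and are not taken from the authors' implementation; skipping Canberra terms whose denominator is zero is also an assumed convention.

```python
import numpy as np


def euclidean_distance(a, b):
    """Square root of the summed squared component differences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))


def canberra_distance(a, b):
    """Sum of |a_i - b_i| / (|a_i| + |b_i|) over all components."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    num = np.abs(a - b)
    den = np.abs(a) + np.abs(b)
    # Assumed convention: a term with a zero denominator contributes nothing.
    mask = den > 0
    return float(np.sum(num[mask] / den[mask]))


def identify(probe, gallery, distance):
    """1:N match: return (label, distance) of the closest stored template.

    `gallery` is a hypothetical dict mapping a subject label to its
    stored feature vector.
    """
    best_label, best_dist = None, float("inf")
    for label, template in gallery.items():
        d = distance(probe, template)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist
```

Because each Canberra term is divided by the magnitudes of the two components being compared, a feature with a large numeric range (for example energy) does not automatically dominate one with a small range (for example entropy), which is not the case for the Euclidean distance.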

IV. Automatic Ear Localisation 

This was a challenging task, as most of the work 
carried out on this aspect is at an experimental stage. The 
algorithm designed here combines the finer points of various 
existing algorithms with additional measures intended to 
improve the ear localization results. The algorithm works 
as follows. 

(i) Take a side face image of an individual (under varied 
background and lighting conditions). 

(ii) Since the RGB representation of color images is not suitable 
for characterizing skin color, the algorithm first converts the RGB color 
space to the chromatic color space and then uses the chromatic 
color information for further processing. 

(iii) Apply threshold to the resultant image and convert to 
binary. 

(iv) A few results under varied background conditions are 
shown in Figure 2 to Figure 4. 




Figure 2. Side face image with open sky background 




Figure 3. Side face image with class room background 

Figure 4. Side face image with class room background from UND 
database 

(v) Once the background is removed from the image, we 
determine the nose tip of the subject. Assuming an 
approximate distance between the subject's nose tip and ear 
pit, and the average size of the human ear, we crop the ear part 
from the side face image. 

(vi) Thereafter, we take a four-pronged approach, determining 
the edge variations from the top (at the helix), from the sides (at the concha 
and at the posterior helix) and from the bottom (at the lobule), to 
locate the ear edges and thus crop the exact ear out 
of the image. 

(vii) Two resulting ear images localized automatically from 
the side face images are shown in Figure 5. 




Figure 5. Automatic cropping for Ear localization 

(viii) The cropped ear images may be of varying sizes so the 
feature set of images may also vary. Hence the images are 
normalized to a constant size. 
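
Steps (ii), (iii) and (viii) above can be sketched in Python as follows. This is only an illustration under stated assumptions: the chromatic space is taken to be the normalized (r, g) chromaticity, the threshold ranges are illustrative placeholder values (the paper does not state the ones actually used), and nearest-neighbour resampling is assumed for the size normalization. All function names are hypothetical.

```python
import numpy as np


def rgb_to_chromatic(image):
    """Map an H x W x 3 RGB image to normalized chromaticity (r, g)."""
    rgb = np.asarray(image, dtype=float)
    total = rgb.sum(axis=2)
    total[total == 0] = 1.0  # avoid division by zero on black pixels
    r = rgb[..., 0] / total
    g = rgb[..., 1] / total
    return r, g


def skin_mask(image, r_range=(0.36, 0.47), g_range=(0.28, 0.36)):
    """Binary skin mask from chromaticity thresholds.

    The default ranges are placeholder values chosen for illustration
    only; they would need tuning on real data.
    """
    r, g = rgb_to_chromatic(image)
    return ((r >= r_range[0]) & (r <= r_range[1]) &
            (g >= g_range[0]) & (g <= g_range[1]))


def resize_nearest(image, out_shape):
    """Normalize a cropped image to a constant size (nearest neighbour)."""
    img = np.asarray(image)
    rows = (np.arange(out_shape[0]) * img.shape[0]) // out_shape[0]
    cols = (np.arange(out_shape[1]) * img.shape[1]) // out_shape[1]
    return img[rows][:, cols]
```

Dividing each channel by the pixel's total intensity removes most of the brightness dependence, which is why a fixed threshold in chromatic space is more stable under varied lighting than one applied directly to RGB values.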



V. Feature Extraction 

The DT-CWT was formulated by Kingsbury and Selesnick [20], [21] 
using two trees (a real tree and an imaginary tree) of DWTs with different 
real filter coefficients. The imaginary-tree filters are designed from the 
coefficients of the real-tree filters to overcome the limitations of 
DWTs. The details of the DT-CWT and feature extraction are given 
in the following subsections. 

A. Dual Tree Complex Wavelet Transform 

In the dual-tree, two real wavelet trees are used, as shown 
in Figure 6, each capable of perfect reconstruction (PR). 
One tree generates the real part of the transform and the 
other generates the imaginary part [20]. As shown, 
{$H_0(z)$, $H_1(z)$} is a Quadrature Mirror Filter (QMF) pair in 
the real-coefficient analysis branch. For the imaginary part, 
{$G_0(z)$, $G_1(z)$} is another QMF pair in the analysis 
branch. All filter pairs are orthogonal and real-valued. 




ACEEE Int. J. on Information Technology, Vol. 02, No. 02, April 2012 



It has been shown [21] that if the filters in the two trees are 
offset by half a sample, the two wavelets satisfy the Hilbert 
transform pair condition, and an approximately analytic 
wavelet is given by Eq(1): 

$$\psi(t) = \psi_h(t) + j\,\psi_g(t) \qquad (1)$$

Figure 6. Selesnick's Dual Tree DWT 

Thus, if $G_0(\omega) = H_0(\omega)\,e^{-j\theta(\omega)}$ and $\theta(\omega) = \omega/2$, then 

$$\Psi_g(\omega) = \begin{cases} -j\,\Psi_h(\omega), & \omega > 0 \\ +j\,\Psi_h(\omega), & \omega < 0 \end{cases} \qquad (2)$$

From Eq(1) and (2), the low pass filters after the first stage and at 
the first stage respectively are given by Eq(3): 

$$g_0(n) = h_0(n - 0.5) \quad \text{and} \quad g_0^{(1)}(n) = h_0^{(1)}(n - 1) \qquad (3)$$

Similar relations also hold true for the high pass filters of both 
trees. 

In this algorithm, (10,10)-tap near-orthogonal wavelet 
filters are used in the first stage and 'db7' filters are used for 
higher stages in the real tree (i.e. $h_0$ and $h_1$) [20]. The imaginary 
low pass filter is derived from the above half-sample delay 
condition. 

The high pass filter is the quadrature-mirror filter of the 
low pass filter. The reconstruction filters are obtained by 
time reversal of the decomposition filters. All the filters used are 
of the same length, following Selesnick's approach [20], [21], [23], 
[24], unlike Kingsbury's approach. 

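The 2D separable DWT can be written in terms of 1D 
scaling functions ($\phi$) and wavelet functions ($\psi$) as: 

$$\psi^{1}(x,y) = \phi(x)\,\psi(y), \quad \psi^{2}(x,y) = \psi(x)\,\phi(y), \quad \psi^{3}(x,y) = \psi(x)\,\psi(y) \qquad (4)$$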



The oriented non-separable 2D wavelet transform is derived 
by combining the sub-bands of two separable 2D DWTs. 
The pair of conjugate filters is applied to the two dimensions 
(x and y), which can be expressed by Eq(5) as given below: 

$$(h_x + j\,g_x)(h_y + j\,g_y) = (h_x h_y - g_x g_y) + j\,(h_x g_y + g_x h_y) \qquad (5)$$

The filter bank structure of the 2D DT CWT used to implement 
Eq(5) is shown in Figure 7. 

Figure 7. Filter bank structure of 2D DT CWT 

Tree-a and tree-b are combined to compute the real part 
of Eq(5), i.e. $(h_x h_y - g_x g_y)$, the real (2D DWT) tree of the CWT, as shown in Figure 
8. Similarly, the imaginary (2D DWT) tree of the CWT can be obtained 
from tree-c and tree-d, i.e. $(h_x g_y + g_x h_y)$, as per Eq(5). 

Figure 8. Formation of Real Tree DT CWT 

Thus, the decomposition for each tree is performed 
independently, one stage after another, i.e. a total of six 
detail sub-bands are derived at each stage: three for the 
real tree and three for the imaginary tree. A 3-stage 
decomposition is performed. At each stage, the coefficients are 
oriented towards their respective directions as stated in Eq(4). 
The following six wavelets, as given by Eq(6), are used to obtain 
oriented 2-D separable wavelets [20]: 



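$$\begin{aligned}
\psi_{1,1}(x,y) &= \phi_h(x)\,\psi_h(y), & \psi_{2,1}(x,y) &= \phi_g(x)\,\psi_g(y), \\
\psi_{1,2}(x,y) &= \psi_h(x)\,\phi_h(y), & \psi_{2,2}(x,y) &= \psi_g(x)\,\phi_g(y), \\
\psi_{1,3}(x,y) &= \psi_h(x)\,\psi_h(y), & \psi_{2,3}(x,y) &= \psi_g(x)\,\psi_g(y)
\end{aligned} \qquad (6)$$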



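where $\psi_{1,i}$ corresponds to the coefficients derived from the 
real tree and $\psi_{2,i}$ corresponds to the coefficients derived from 
the imaginary tree. They can be combined by Eq(7) to form 
complex wavelet coefficients: 

$$\psi_i(x,y) = \frac{1}{\sqrt{2}}\left(\psi_{1,i}(x,y) - \psi_{2,i}(x,y)\right), \quad \psi_{i+3}(x,y) = \frac{1}{\sqrt{2}}\left(\psi_{1,i}(x,y) + \psi_{2,i}(x,y)\right), \quad i = 1, 2, 3 \qquad (7)$$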



Normalization by $1/\sqrt{2}$ is used so that the sum and difference 
operation is orthonormal. These six wavelet 
sub-bands of the 2-D DT-CWT are strongly oriented in the 
{+15°, +45°, +75°, -15°, -45°, -75°} directions, as shown in fig(5) 
by red lines, and capture image information in those 
directions. Thus, 2D dual-tree wavelets are not 
only approximately analytic but also oriented, and they are shift 
invariant because of their analytic structure [20]. The impulse 
responses of the three 2-D sub-bands (2-D non-separable filters 
for detail coefficients) 
of the DWT and the six sub-bands (2-D non-separable filters for 
detail coefficients) of the DT-CWT are shown in Figure 9. 
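
The normalization remark above can be checked numerically. The following Python sketch assumes the combination is the usual sum and difference of corresponding real-tree and imaginary-tree sub-bands scaled by 1/sqrt(2); the function name and the random test arrays are hypothetical stand-ins for actual sub-band coefficients.

```python
import numpy as np


def combine_subbands(real_band, imag_band):
    """Sum/difference of two sub-bands with 1/sqrt(2) normalization.

    Returns (difference_band, sum_band). Pairing one real-tree sub-band
    with the matching imaginary-tree sub-band is an assumption here.
    """
    a = np.asarray(real_band, dtype=float)
    b = np.asarray(imag_band, dtype=float)
    diff_band = (a - b) / np.sqrt(2.0)
    sum_band = (a + b) / np.sqrt(2.0)
    return diff_band, sum_band


# Stand-in coefficients: any two arrays of equal shape will do.
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))
b = rng.standard_normal((8, 8))
d, s = combine_subbands(a, b)

# With the 1/sqrt(2) factor the total energy is unchanged.
energy_in = np.sum(a ** 2) + np.sum(b ** 2)
energy_out = np.sum(d ** 2) + np.sum(s ** 2)
```

With the scaling in place `energy_in` and `energy_out` agree up to rounding; without it the output energy would be doubled, which is the sense in which the normalization keeps the operation orthonormal.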

B. Feature Extraction 

Ear analysis using the DWT provides singularities (edges) 
in only three directions (0°, 45°, 90°) and without phase 
information. This is improved by finding the singularities, 
with phase information, in six directions (±15°, ±45°, ±75°) 
and at many frequency bands using the DT-CWT, which 
yields shift-invariant features for better accuracy and 
efficiency at lower computational cost than existing 
methods. 
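
To illustrate the three-orientation limitation of the separable DWT mentioned above, here is a small Python sketch of a single-level 2-D DWT. It uses the Haar wavelet purely for brevity; this is not the filter set used in this paper, and the function name and sub-band names are illustrative.

```python
import numpy as np


def haar_dwt2(image):
    """One level of a separable 2-D Haar DWT (even height and width assumed)."""
    x = np.asarray(image, dtype=float)
    s = np.sqrt(2.0)
    # Filter along each row: pairwise sums (low pass) and differences (high pass).
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Filter the two results along each column.
    approx = (lo[0::2, :] + lo[1::2, :]) / s
    horizontal = (lo[0::2, :] - lo[1::2, :]) / s  # responds to horizontal edges
    vertical = (hi[0::2, :] + hi[1::2, :]) / s    # responds to vertical edges
    diagonal = (hi[0::2, :] - hi[1::2, :]) / s    # responds to both diagonals
    return approx, horizontal, vertical, diagonal


# A 2 x 2 patch along one diagonal and its mirror image along the other.
_, _, _, diag_a = haar_dwt2(np.array([[1.0, 0.0], [0.0, 1.0]]))
_, _, _, diag_b = haar_dwt2(np.array([[0.0, 1.0], [1.0, 0.0]]))
```

Both diagonal patterns land in the same `diagonal` sub-band and differ only in sign, so the magnitude of a separable DWT coefficient cannot tell a +45° structure from a -45° one. The six oriented sub-bands of the DT-CWT are intended to remove exactly this ambiguity.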

From a detailed study of the prevalent techniques already 
employed for ear recognition, it was realized that complex 
wavelets had not previously been used for ear recognition. This 
realization laid the foundation for using this approach to 
determine whether it can further 
improve the recognition rates already achieved 
by other methods. Given the advantages of the CWT over 
the DWT, the Dual Tree Complex 
Wavelet Transform (Selesnick) (DT-CWT(S)) is employed for this work. 

The DT-CWT(S) algorithm is used to design and 
implement the dual-tree structure (up to Level 2) in 
MATLAB, employing the first-stage and second-stage low pass 
and high pass filter coefficients given by Selesnick. The 
impulse responses of the three 2-D sub-bands (2-D non-separable 
filters for detail coefficients) of the DWT and the six sub-bands 
(2-D non-separable filters for detail coefficients) of the 
DT-CWT are shown in Figure 9. 




Figure 9. Impulse responses of sub-bands of DWT and DT-CWT. 

The DT-CWT has six directional wavelets oriented at angles 
of ±15°, ±45° and ±75° in two dimensions. These six directions are obtained 
by passing the 2-D signal (image) through the real tree 
structure using the filter coefficients of both the real and 
imaginary trees. The wavelet coefficients of each image in 
the training database were thus obtained in 
all six directions and stored for further matching and 
testing. These directions can be seen clearly in Figure 
10, which represents the Level 1 decomposition of an image. 

Figure 10. Real and imaginary tree wavelet sub-band images 

VI. Experimental Results 

All the training images of both databases (the MCTE 
database of 240 images of 40 subjects for right and left ears 
and the UND database of 219 subjects under the J-collection and 
G-collection) are processed and their respective wavelet 
coefficients at Level 1 and Level 2 are calculated. The energy, 
entropy, mean and standard deviation of each image's 
wavelet coefficients are then calculated and stored in an MS 
Access database. Thereafter, images from the test set and 
random images were matched against these stored values using 
Euclidean and Canberra distance matching, and 
results for False Acceptance Rate (FAR), False Rejection Rate 
(FRR), Equal Error Rate (EER) and the Receiver Operating Characteristic 
(ROC) curve were compiled at various thresholds. All the results are given 
in Table I. 
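
The four statistics can be sketched in Python as follows. The paper does not give formulas for energy and entropy, so the definitions below are assumptions: energy is taken as the mean squared coefficient magnitude, and entropy as the Shannon entropy (base 2) of the normalized squared magnitudes. The function names are hypothetical.

```python
import numpy as np


def subband_features(coeffs):
    """Return [mean, std, energy, entropy] of one sub-band.

    Statistics are computed on coefficient magnitudes, so real and
    complex sub-bands are handled the same way. The energy and entropy
    definitions are assumed, not taken from the paper.
    """
    c = np.abs(np.asarray(coeffs)).astype(float).ravel()
    mean = c.mean()
    std = c.std()
    power = c ** 2
    energy = power.sum() / c.size
    total = power.sum()
    if total > 0:
        p = power / total
        nz = p[p > 0]  # skip zero-probability bins to avoid log(0)
        entropy = -np.sum(nz * np.log2(nz))
    else:
        entropy = 0.0
    return np.array([mean, std, energy, entropy])


def feature_vector(subbands):
    """Concatenate the four statistics of every sub-band into one vector."""
    return np.concatenate([subband_features(b) for b in subbands])
```

Under these assumptions, six DT-CWT sub-bands at one level yield a 24-element vector, and using only a subset of the statistics (for example energy alone) simply means keeping the corresponding entries.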

Figures 11 to 14 show the FAR, FRR and ROC for the best and 
worst cases of Canberra distance and the best and worst cases of 
Euclidean distance when tested on the database with the following 
details. 

Name Of Data-Base : UND - Collection G 
No of Images in the Training database: 20 
No of Images in Test Database: 90 
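
As a hedged illustration of how FAR, FRR and EER can be obtained from distance scores, the Python sketch below assumes a comparison is accepted when its distance is at or below the threshold. The score lists and function names are hypothetical and do not reproduce the authors' data or code.

```python
import numpy as np


def far_frr(genuine, impostor, threshold):
    """FAR and FRR at one distance threshold.

    `genuine` holds distances between images of the same subject and
    `impostor` holds distances between images of different subjects.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    far = np.mean(impostor <= threshold)  # impostors wrongly accepted
    frr = np.mean(genuine > threshold)    # genuine users wrongly rejected
    return float(far), float(frr)


def equal_error_rate(genuine, impostor, thresholds):
    """Return (threshold, EER) where FAR and FRR are closest to each other."""
    best = None
    for t in thresholds:
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2.0)
    return best[1], best[2]
```

Sweeping the threshold and plotting the resulting (FAR, FRR) pairs gives the kind of curves reported in the figures; the EER is read off where the two rates cross.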

Figure 11. Results using Canberra distance and feature vector of 
energy only (Worst case of Canberra Distance) 

Figure 12. Results using Canberra distance and feature vector of 
energy + std deviation + entropy (Best case of Canberra Distance) 

Table I. Compiled results of Avg FAR, Avg FRR and Avg Recognition rate using Canberra and Euclidean distance for different feature vectors 

Figure 13. Results using Euclidean distance and feature vector of Std 
Deviation only (Worst case of Euclidean Distance) 







Figure 14. Results using Euclidean distance and feature vector of 
energy + std deviation + entropy (Best case of Euclidean Distance) 

A maximum recognition rate of 81% is obtained when the 
DWT is used for feature extraction and Canberra distance is 
used as the similarity metric for the combined vector of energy, std. 
deviation and entropy. The FAR, FRR and ROC for this case are shown in 
Figure 15. 

Figure 15. Results of DWT using Canberra distance and feature 
vector of energy + std deviation + entropy (Best case of DWT) 

Conclusions 

The authors have introduced the 2D DT-CWT for ear 
recognition for the first time because of its ability to capture shift-invariant 
features in six orientations. The experimental results 
have demonstrated the effectiveness of the proposed method 
in terms of improving the recognition rate. 

Canberra distance has shown better results than Euclidean 
distance because it normalises the individual feature 
components before finding the distance between the two 
images, so that components with large magnitudes do not dominate the match. 

The best recognition rate of over 97% has been achieved 
using Canberra distance when feature vectors of energies, 
standard deviation and entropy of sub-bands of DT-CWT 
are used together. 

The authors are working on improving the recognition 
rate by using the RCWF [22], [23], [24] in combination with 
DT-CWT to obtain features in 12 orientations (six by 
DT-CWT and six by RCWF). 